sub-gaussian distribution
3ce3bd7d63a2c9c81983cc8e9bd02ae5-Supplemental.pdf
We start by restating the setup in which our algorithm operates. The type of ICA considered in our work assumes the following generative model. There are $d$ sources recorded $T$ times, forming the columns of $S := [s_1, \dots, s_T] \in \mathbb{R}^{d \times T}$, whose components $s_{1t}, \dots, s_{dt}$ are assumed non-Gaussian and independent. Without loss of generality, we assume that each source has zero mean, unit variance, and finite and distinct kurtosis, a common assumption among kurtosis-based ICA methods [12]. The kurtosis of a random variable $v$ is defined as $\mathrm{kurt}[v] = \mathbb{E}\left[(v - \mathbb{E}(v))^4\right] / \left(\mathbb{E}\left[(v - \mathbb{E}(v))^2\right]\right)^2$. Finally, sources are assumed to be mixed through a linear system, i.e., there exists a full-rank mixing matrix $A \in \mathbb{R}^{d \times d}$ producing the $d$-dimensional mixture $x_t$, expressed as $x_t = A s_t$ for all $t \in \{1, \dots, T\}$.
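The generative model above can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the choice of $d = 3$, the three source distributions (uniform, Laplace, Rademacher, each scaled to zero mean and unit variance so that their kurtoses are distinct and different from the Gaussian value of 3), the random mixing matrix, and the function names `kurtosis` and `sample_ica_mixture`.

```python
import numpy as np


def kurtosis(v):
    """Sample version of kurt[v] = E[(v - E v)^4] / (E[(v - E v)^2])^2."""
    c = v - v.mean()
    return np.mean(c**4) / np.mean(c**2) ** 2


def sample_ica_mixture(T=100_000, seed=0):
    """Draw d = 3 illustrative sources and mix them as x_t = A s_t.

    The source distributions and the mixing matrix are hypothetical
    choices for this sketch, not the ones used in the paper.
    """
    rng = np.random.default_rng(seed)
    # Each row is one source: zero mean, unit variance, non-Gaussian.
    S = np.vstack([
        rng.uniform(-np.sqrt(3), np.sqrt(3), T),  # population kurtosis 1.8
        rng.laplace(0.0, 1 / np.sqrt(2), T),      # population kurtosis 6
        rng.choice([-1.0, 1.0], T),               # population kurtosis 1
    ])
    d = S.shape[0]
    # A random Gaussian matrix is full rank with probability one.
    A = rng.standard_normal((d, d))
    assert np.linalg.matrix_rank(A) == d
    X = A @ S  # column t of X is x_t = A s_t
    return S, A, X


S, A, X = sample_ica_mixture()
print("empirical source kurtoses:", [round(float(kurtosis(s)), 2) for s in S])
```

The empirical kurtoses should land near the population values noted in the comments, with the Laplace estimate being the noisiest because it depends on higher moments.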
2a91de02871011d0090e662ffd6f2328-Supplemental-Conference.pdf
The structure of the appendix mainly follows the roadmap of the proof described in Section 4.4. In Appendix A, we define the characterizable population risk function in (31) to approximate the objective function. Appendix A also introduces notation that simplifies the analysis, and we recommend that readers refer to Table 3 for the major notation used in the proofs. In this paper, we consider the multi-layer case and need to derive a lower bound for the Hessian matrix for all the layers. Moreover, the input of an intermediate layer cannot be proved to be Gaussian; it can only be shown to follow a sub-Gaussian distribution.
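The distinction between Gaussian and sub-Gaussian intermediate inputs can be seen in a small numerical sketch. It assumes, purely for illustration, a single standard Gaussian pre-activation passed through a ReLU; the source does not state which activation is used, and the function name `relu_tail_check` is hypothetical. The output $h = \max(z, 0)$ is clearly not Gaussian (it is skewed, with half its mass at zero), yet for every $t > 0$ it satisfies $P(h > t) = P(z > t) \le \exp(-t^2/2)$, a Gaussian-type tail bound.

```python
import numpy as np


def relu_tail_check(n=1_000_000, seed=0, thresholds=(0.5, 1.0, 2.0, 3.0)):
    """Compare empirical tails of relu(z), z ~ N(0, 1), with exp(-t^2 / 2).

    The ReLU activation and the scalar Gaussian input are assumptions
    made for this sketch only.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    h = np.maximum(z, 0.0)  # hypothetical intermediate-layer input

    # Skewness of a Gaussian is 0; a clearly positive value shows h is not Gaussian.
    c = h - h.mean()
    skew = float(np.mean(c**3) / np.mean(c**2) ** 1.5)

    # Each entry: (threshold t, empirical P(h > t), bound exp(-t^2 / 2)).
    tails = [(t, float(np.mean(h > t)), float(np.exp(-t**2 / 2))) for t in thresholds]
    return skew, tails


skew, tails = relu_tail_check()
print("skewness:", round(skew, 2))
for t, empirical, bound in tails:
    print(f"t={t}: empirical tail {empirical:.5f} <= bound {bound:.5f}")
```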